SoulMate: Short-Text Author Linking Through Multi-Aspect Temporal-Textual Embedding
Abstract
Linking authors of short-text contents has important usages in many applications, including Named Entity Recognition (NER) and human community detection. However, certain challenges lie ahead. First, the input short-text contents are noisy, ambiguous, and do not follow grammatical rules. Second, traditional text mining methods fail to effectively extract concepts through words and phrases. Third, the textual contents are temporally skewed, which can affect semantic understanding by multiple time facets. Finally, using knowledge-bases can make the results biased toward the content of the external database and deviate the meaning from the input short-text corpus. To overcome these challenges, we devise a neural network-based temporal-textual framework that generates subgraphs with highly correlated authors from short-text contents. Our approach, on the one hand, computes the relevance score (edge weight) between the authors by considering a portmanteau of contents and concepts, and on the other hand, employs a stack-wise graph cutting algorithm to extract the communities of related authors. Experimental results show that, compared to other knowledge-centered competitors, our multi-aspect vector space model can achieve a higher performance in linking short-text authors. In addition, given the author linking task, the more comprehensive the dataset is, the higher the significance of the extracted concepts will be.
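As a rough illustration of the two stages the abstract names (scoring author pairs, then cutting the graph into communities), here is a minimal sketch. It is not the paper's method: cosine similarity stands in for the relevance score, a fixed threshold followed by connected components stands in for the stack-wise graph cutting algorithm, and the function names, the `threshold` parameter, and the sample vectors are all hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_authors(author_vectors, threshold=0.8):
    """Return (edge weights, author groups) for a dict of author -> vector.

    Assumption: each author is already represented by one embedding vector.
    Edges below `threshold` are cut; the remaining connected components
    are reported as groups of linked authors.
    """
    authors = list(author_vectors)

    # Stage 1: edge weight (stand-in relevance score) for every author pair.
    weights = {
        (a, b): cosine(author_vectors[a], author_vectors[b])
        for a, b in combinations(authors, 2)
    }

    # Stage 2a: cut weak edges, keeping an adjacency map of the strong ones.
    neighbours = {a: set() for a in authors}
    for (a, b), w in weights.items():
        if w >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Stage 2b: collect connected components with an iterative depth-first search.
    seen, groups = set(), []
    for start in authors:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(neighbours[node] - seen)
        groups.append(group)
    return weights, groups


# Hypothetical toy vectors, not data from the paper.
sample = {
    "alice": [1.0, 0.0, 0.1],
    "bob": [0.9, 0.1, 0.0],
    "carol": [0.0, 1.0, 0.0],
    "dave": [0.1, 0.9, 0.1],
}
weights, groups = link_authors(sample)
print(groups)
```

A single global threshold is the simplest possible cut; it ignores the temporal facets and concept-level signals that the abstract says the actual framework combines.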
Similar resources
Learning to Author Text with textual CBR
Textual reuse is an integral part of textual case-based reasoning (TCBR), which deals with solving new problems by reusing previous similar problem-solving experiences documented as text. We investigate the role of text reuse for text authoring applications that involve feedback or review generation. Generally, providing feedback by assigning a rating from a Likert scale is far easier...
Bundle Optimization for Multi-aspect Embedding
Understanding semantic similarity among images is the core of a wide range of computer graphics and computer vision applications. An important step towards this goal is to collect and learn human perceptions. Interestingly, the semantic context of images is often ambiguous as images can be perceived with emphasis on different aspects, which may be contradictory to each other. In this paper, we ...
Author Identification: Using Text Mining, Feature Engineering & Network Embedding
Authorship analysis is a challenging area that has developed over centuries, with research scattered widely across multiple disciplines, mainly computational linguistics, text mining, data mining, stylometry and machine learning. Conventional techniques relied heavily on stylometry and text-based content analysis of document text for authorship analysis. More recen...
Multi-Task Label Embedding for Text Classification
Multi-task learning in text classification leverages implicit correlations among related tasks to extract common features and yield performance gains. However, most previous works treat the labels of each task as independent and meaningless one-hot vectors, which causes a loss of potential information and makes it difficult for these models to jointly learn three or more tasks. In this paper, we prop...
Author gender identification from text using Bayesian Random Forest
The heavy use of virtual environments and the connection of users through social networks like Facebook, Instagram, and Twitter make finding shared subjects in these environments more necessary than before. There are several applications that benefit from reliable methods for inferring the age and gender of users in social media. Such applications exist across a wide area of fields,...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2022
ISSN: 1558-2191, 1041-4347, 2326-3865
DOI: https://doi.org/10.1109/tkde.2020.2982148